Multi-interval Discretization Methods for Decision Tree Learning

Authors

  • Petra Perner
  • Sascha Trautzsch
Abstract

Properly addressing the discretization of continuous-valued features is an important problem in decision tree learning. This paper describes four multi-interval discretization methods for decision tree induction, applied in a dynamic fashion. We compare two known discretization methods with two new methods proposed in this paper: one histogram-based and one based on a neural network (LVQ, learning vector quantization). The methods are compared by the accuracy and the compactness of the resulting decision tree. The comparison uses three data sets: the IRIS domain, the satellite domain, and the OHS (ovarian hyperstimulation) domain.
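
The abstract does not spell out the four methods, so the following is only an illustrative sketch of one widely used idea: recursive entropy-based splitting of a single continuous feature into several intervals. It is not the authors' histogram-based or LVQ-based method. The function names (`entropy`, `best_cut`, `multi_interval_cuts`) and the stopping parameters `min_gain` and `max_depth` are assumptions chosen for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut, gain) for the binary cut with the highest information
    gain, or (None, 0.0) if no cut improves on the unsplit data."""
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a cut between identical values
        left, right = ys[:i], ys[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        gain = base - weighted
        if gain > best[1]:
            # candidate cut is the midpoint between two neighbouring values
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, gain)
    return best


def multi_interval_cuts(values, labels, min_gain=0.1, max_depth=3):
    """Recursively split one feature; returns the cut points in ascending order.

    min_gain and max_depth are hypothetical stopping rules for this sketch.
    """
    if max_depth == 0 or len(set(labels)) < 2:
        return []
    cut, gain = best_cut(values, labels)
    if cut is None or gain < min_gain:
        return []
    lo = [(v, y) for v, y in zip(values, labels) if v <= cut]
    hi = [(v, y) for v, y in zip(values, labels) if v > cut]
    return (multi_interval_cuts(*zip(*lo), min_gain, max_depth - 1)
            + [cut]
            + multi_interval_cuts(*zip(*hi), min_gain, max_depth - 1))
```

Calling `multi_interval_cuts(feature_values, class_labels)` on one feature column yields the interval boundaries; a tree learner could then treat each interval as one branch of a multi-way split. The fixed gain threshold is a simplification; published entropy-based methods typically use a more principled stopping criterion.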

Similar resources

A Comparision of Different Multi-Interval Discretization Methods for Decision Tree Learning

Properly addressing the discretization process of continuous valued features is an important problem during decision tree learning. This paper describes four multi-interval discretization methods for induction of decision trees used in dynamic fashion. We compare two known discretization methods to two new methods proposed in this paper based on a histogram based method and a neural net based me...

Cost Sensitive Discretization of

Many algorithms in decision tree learning are not designed to handle numeric valued attributes very well. Therefore, discretization of the continuous feature space has to be carried out. In this article we introduce the concept of cost sensitive discretization as a preprocessing step to induction of a classifier and as an elaboration of the error-based discretization method to obtain an optimal...

Cost Sensitive Discretization of Numeric Attributes

Many algorithms in decision tree learning have not been designed to handle numerically-valued attributes very well. Therefore, discretization of the continuous feature space has to be carried out. In this article we introduce the concept of cost-sensitive discretization as a preprocessing step to induction of a classifier and as an elaboration of the error-based discretization method to obtain ...

MMDT: Multi-Objective Memetic Rule Learning from Decision Tree

In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretability are two measures that conflict with each other. In this approach, we consider both the accuracy and the interpretability of rule sets. Additionally, individual classifiers face other problems such as huge sizes, high dimensionality and data sets with imbalanced class distributions. This...

Evaluating the performance of cost-based discretization versus entropy- and error-based discretization

Discretization is defined as the process that divides continuous numeric values into intervals of discrete categorical values. In this article, the concept of cost-based discretization as a pre-processing step to the induction of a classifier is introduced in order to obtain an optimal multi-interval splitting for each numeric attribute. A transparent description of the method and the steps inv...

Publication date: 1998